Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: read team token from kafka headers in blobby #18337

Merged
merged 21 commits into from
Nov 8, 2023

Conversation

pauldambra
Copy link
Member

@pauldambra pauldambra commented Nov 2, 2023

Recently we read tens of millions of messages knowing the only action was to drop them. For every message we had to parse the body to get the token so we could check if we wanted to drop it.

Let's put the team token into the kafka headers since if there's an incident we always care about those values.

And then we can read the team token from the header and drop the message without parsing it when the team does not have session recording enabled.

This still also reads the body to check for team token as a fallback. If nothing else this lets us deploy this without worrying about rolling out a switcheroo

So

token in headers, team not found - return early (new behaviour)
token in headers, team found - does parse the body, but doesn't check the team again
token not in headers, team not found - return early (as before)
token not in headers, team found - (as before)

@pauldambra pauldambra mentioned this pull request Nov 4, 2023
@pauldambra pauldambra marked this pull request as ready for review November 7, 2023 07:49
Copy link
Contributor

@benjackwhite benjackwhite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. Was going to comment on how will we know if it works but i guess thats the distinction of the drop_cause ?

@@ -217,7 +217,7 @@ export class SessionManager {
return
}

const lagMultiplier = getLagMultipler(partitionLag)
const lagMultiplier = getLagMultiplier(partitionLag)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multipler is the formal version of the word... 🙈

@pauldambra
Copy link
Member Author

Was going to comment on how will we know if it works but i guess thats the distinction of the drop_cause

yep, I made that more explicit so we'll see no more of the existing drop cause and one or other of the new ones 👍


// NOTE: This is simple validation - ideally we should do proper schema based validation
if (event.event !== '$snapshot_items' || !$snapshot_items || !$session_id) {
status.warn('🙈', 'Received non-snapshot message, ignoring')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's increment eventDroppedCounter too here for visibility. Ideally, I'd love for all drop scenarios to increment this metric.


return statusWarn('header_token_team_missing_or_disabled', {
token: token,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have an opportunity to drop here if !teamIdWithConfig.session_recording_opt_in, don't we?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's implicit in this check. We only load teams that exist and have session_recording_opt_in set to true.

I guess the assumption was it wasn't valuable enough to know the difference in the event dropped counter to stand the cost of loading more teams...

@pauldambra pauldambra merged commit cf4ba98 into master Nov 8, 2023
65 checks passed
@pauldambra pauldambra deleted the feat/read-token-from-headers branch November 8, 2023 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants